Abstract

The  analysis  of  quantitative  properties,  such  as  timing  and  power,  is  central  to  the  design  of  reliable 
real-time  embedded  software  and  systems.  However,  the  verification  of  such  properties  on  a  program  is 
made  difficult  by  their  heavy  dependence  on  the  program’s  environment,  such  as  the  processor  it  runs  on. 
Modeling  the  environment  by  hand  can  be  tedious,  error-prone,  and  time  consuming.  In  this  paper,  we 
present  a  new,  game-theoretic  approach  to  analyzing  quantitative  properties  that  is  based  on  performing 
systematic  measurements  to  automatically  learn  a  model  of  the  environment.  We  model  the  estimation 
problem  as  a  game  between  our  algorithm  (player)  and  the  environment  of  the  program  (adversary), 
where  the  player  seeks  to  accurately  predict  program  properties  while  the  adversary  sets  environment 
parameters  to  thwart  the  player.  We  present  both  theoretical  and  experimental  evidence  for  the  utility  of 
our game-theoretic approach. On the theoretical side, we show that we can predict the program property
for all execution paths with probability greater than $1 - \delta$ by making a number of measurements
that is only polynomial in $\ln(1/\delta)$ and the program size. Experimental results for execution time analysis
demonstrate  that  our  approach  is  efficient,  effective,  and  highly  portable. 


1  Introduction 

The  main  distinguishing  characteristic  of  embedded  computer  systems  is  their  tight  integration  with  the 
physical  world.  Consequently,  the  behavior  of  software  controllers  of  such  cyber-physical  systems  has  a 
major  effect  on  physical  properties  of  such  systems.  These  properties  are  quantitative,  including  constraints 
on  resources,  such  as  timing  and  power,  and  specifications  involving  physical  parameters,  such  as  position 
and  velocity.  The  verification  of  such  physical  properties  of  embedded  software  systems  requires  modeling 
not only the software program but also the relevant aspects of the program’s environment. However, only
limited  progress  has  been  made  on  these  verification  problems.  One  of  the  biggest  obstacles  is  to  create  an 
adequately  accurate  model  of  a  complex  environment. 

Consider, for example, the problem of estimating the execution time of a software task. This problem
plays  a  central  role  in  the  design  of  real-time  embedded  systems,  to  provide  timing  guarantees  and  for  use  in 
scheduling  algorithms.  In  spite  of  significant  research  on  this  topic  over  the  last  20  years  (e.g.  [14,  20]),  this 
problem  remains  far  from  solved.  The  complexity  arises  from  the  two  dimensions  of  the  problem:  the  path 
problem, which is to find the worst-case path through the task, and the state problem, which seeks to find the
worst-case  environment  state  to  run  the  task  from.  The  problem  is  particularly  challenging  because  these 
two  dimensions  interact  closely:  the  choice  of  path  affects  the  state  and  vice-versa.  Significant  progress 
has  been  made  on  this  problem,  especially  in  the  computation  of  bounds  on  loops  in  tasks,  in  modeling 
the  dependencies  amongst  program  fragments  using  (linear)  constraints,  and  modeling  some  aspects  of 
processor  behavior.  However,  as  pointed  out  in  recent  papers  by  Lee  [12]  and  Kirner  and  Puschner  [11],  it 
is becoming increasingly difficult to precisely model the complexities of the underlying hardware platform
(e.g., out-of-order processors with deep pipelines, branch prediction, caches, parallelism) as well as the
software environment. This results in timing estimates that are either too pessimistic (due to conservative
platform  modeling)  or  too  optimistic  (due  to  unmodeled  features  of  the  platform).  Industry  practice  typically 
involves  making  random,  unguided  measurements  to  obtain  timing  estimates.  As  Kirner  and  Puschner  [11] 
write,  a  major  challenge  for  measurement-based  techniques  is  the  automatic  and  systematic  generation  of 
test  data. 

In  this  paper,  we  present  a  new  game-theoretic  approach  to  verifying  physical  properties  of  embedded 
software by running systematic tests of the software in its target environment, and learning an environment
model. The following salient features of our approach distinguish it from previous approaches in the
literature: 

•  Game-theoretic  formulation:  We  model  the  problem  of  estimating  a  physical  quantity  (such  as  time) 
as  a  multi-round  game  between  our  estimation  algorithm  (player)  and  the  environment  of  the  program 
(adversary). The physical quantity is modeled as the length of the particular execution path the program
takes. In the game, the player seeks to estimate the length of any path through the program
while the adversary sets environment parameters to thwart the player. Each round of the game constitutes
one test. Over many rounds, our algorithm learns enough about the environment to be able
to  accurately  predict  path  lengths  with  high  probability.  In  particular,  we  show  how  our  algorithm 
can  be  used  to  predict  the  longest  path  and  thus  predict  properties  such  as  worst-case  execution  time 
(WCET). 

•  Learning  an  environment  model:  A  key  component  of  our  approach  is  the  use  of  statistical  learning 
to  generate  an  environment  model  that  is  used  to  estimate  the  physical  quantity  of  interest.  The 
environment  is  viewed  as  an  adversary  that  selects  weights  on  edges  of  the  program's  control  flow 
graph  in  a  manner  that  can  depend  on  the  choice  of  the  path  being  tested.  This  path-dependency  is 
modeled  as  a  perturbation  of  weights  that  can  be  introduced  by  the  adversary.  Our  algorithm  seeks  to 
estimate  path  lengths  in  spite  of  such  adversarial  setting  of  weights.  The  algorithm  is  robust  not  only 
to  adversarial  choices  made  by  the  environment,  but  also  to  errors  in  measurement. 

•  Systematic  and  efficient  testing:  Another  central  idea  is  to  perform  systematic  measurements  of  the 
physical  quantity,  by  sampling  only  so-called  basis  paths  of  the  program.  The  intuition  is  that  the 
length  of  any  program  path  can  be  approximated  as  a  linear  combination  of  the  observed  lengths 
of the basis paths. We use satisfiability modulo theories (SMT) solvers and integer programming to
generate  feasible  basis  paths  and  to  generate  test  inputs  to  drive  a  program's  execution  down  a  basis 
path. 

Although our focus in this paper is on software analysis, we believe that the above concepts are also useful
for  the  analysis  of  physical  properties  of  embedded  systems  in  general. 

We  present  both  theoretical  and  experimental  results  demonstrating  the  utility  of  our  approach.  On  the 
theoretical side, we prove that if we run a number of tests that is polynomial in the input size and $\ln(1/\delta)$,
our algorithm can accurately estimate the length of any path in the program with probability $1 - \delta$ (formal
statement in Section 4). In particular, we can use this result to estimate the length of the longest path; for
timing, this amounts to estimating the worst-case execution time (WCET). More generally, we show that
our algorithm can estimate the length of all program paths (i.e., the “timing profile” of the program) and, for
any $\varepsilon$, it can also be used to find paths of length within $\varepsilon$ of the longest.

We  demonstrate  our  approach  for  the  problem  of  execution  time  analysis  of  embedded  software.  Our 
approach is implemented in a tool called GameTime. We present experimental results comparing GameTime
to existing state-of-the-art WCET estimation tools that are based on combining static analysis and
integer programming. Results indicate that our approach can generate even larger execution-time estimates
than these techniques, without incurring the difficulties involved in modeling complex processor behavior.
Since  our  approach  is  measurement-based,  it  is  easy  to  apply  to  varied  and  complex  platforms.  Moreover, 
as noted above, our approach can be used not just for worst-case analysis, but also to predict $\varepsilon$-longest paths
and  for  predicting  execution  times  of  arbitrary  program  paths. 

For  concreteness,  we  focus  the  rest  of  the  paper  on  execution  time  analysis.  However,  the  theoretical 
formulation and results in Section 4 are potentially applicable to estimating any physical quantity of embedded
software; we have therefore sought to present our theoretical results in a general manner as relating
to  the  lengths  of  paths  in  a  graph. 

The  outline  of  the  paper  is  as  follows.  We  begin  with  a  survey  of  related  work  in  Section  2,  mainly 
focussed on execution time analysis. The basic formulation and an overview of our approach are given in
Section 3. The algorithm and main theorems are given in Section 4, and experimental results in Section 5.
We  conclude  in  Section  6. 

A  preliminary  version  of  this  work  appeared  in  [22].  This  technical  report  expands  on  both  theoretical 
and  experimental  results,  describing  the  theoretical  model  in  far  greater  detail. 

2  Background  and  Related  Work 

We  briefly  review  literature  on  estimating  physical  parameters  of  software  and  relevant  results  from  learning 
theory. 

2.1  Estimating  Execution  Time  and  Other  Physical  Quantities 

There is a vast literature on estimating execution time, especially WCET analysis, comprehensively surveyed
by  Li  and  Malik  [14]  and  Wilhelm  et  al.  [27,  20].  For  lack  of  space,  we  only  include  here  a  brief  discussion 
of  current  approaches  and  do  not  cover  all  tools.  References  to  current  techniques  can  be  found  in  a  recent 
survey  [20]. 

There are two parts to current WCET estimation methods: program path analysis (also called control
flow analysis) and processor behavior analysis. In program path analysis, the tool tries to find the program
path  that  exhibits  worst-case  execution  time.  In  processor  behavior  analysis  (PBA),  one  models  the  details  of 
the  platform  that  the  program  will  execute  on,  so  as  to  be  able  to  predict  environment  behavior  such  as  cache 
misses  and  branch  mis-predictions.  PBA  is  an  extremely  time-consuming  process,  with  several  man-months 
required  to  create  a  reliable  timing  model  of  even  a  simple  processor  design. 

Current tools are broadly classified into those based on static analysis (e.g., aiT, Bounds-T, SWEET,
Chronos) and those that are measurement-based (e.g., RapiTime, SymTA/P, Vienna M./R). Static tools
rely on abstract interpretation and dataflow analysis to compute facts at program points that identify dependencies
between code fragments and generate loop bounds. Even static techniques use measurement
for  estimating  the  time  for  small  program  fragments,  and  measurement-based  techniques  rely  on  techniques 
such  as  model  checking  to  guide  path  exploration.  Static  techniques  also  perform  implicit  path  enumeration 
(termed  “IPET”),  usually  based  on  integer  linear  programming.  The  state-of-the-art  measurement-based 
techniques [26] are based on generating test data by a combination of program partitioning, random and
heuristic  test  generation,  and  exhaustive  path  enumeration  by  model  checking. 

Our technique is measurement-based; hence, it suffers no over-estimation and is easy to port to a new
platform. It is distinct from existing measurement-based techniques due to the novel game-theoretic formulation,
basis path-based test generation, and the use of online learning to infer an environment model. Our
approach  does  rely  on  some  static  techniques,  in  deriving  loop  bounds  and  using  symbolic  execution  and 
satisfiability solvers to compute inputs to drive the program down a specific path of interest. In particular,
note  that  our  approach  completely  avoids  the  difficulties  of  processor  behavior  analysis,  instead  directly 
executing the program on its target platform. Moreover, our approach applies not just to WCET estimation,
but  also  to  estimating  the  entire  execution  time  profile  of  a  program. 

While  there  have  been  several  papers  about  quantitative  verification  of  formal  models  of  systems  (e.g.  [5]), 
these typically assume that the quantitative parameters of primitive elements (such as execution time of software
tasks) are given as input. There is relatively little work on directly verifying non-timing properties on
software,  with  the  exception  of  estimating  the  power  used  by  software-controlled  embedded  systems  [24]. 

Adversarial analysis has been employed for problems such as system-level dynamic power management [10],
but to our knowledge, the adversarial model and analysis used in this paper is the first for timing
estimation  and  for  estimating  quantitative  parameters  of  software. 

2.2  Learning  Theory 

Results  of  this  paper  build  on  the  game-theoretic  prediction  literature  in  learning  theory.  This  field  has 
witnessed an increasing interest in sequential (or online) learning, whereby an agent discovers the world by
repeatedly  acting  and  receiving  feedback.  Of  particular  interest  is  the  problem  of  learning  in  the  presence  of 
an  adversary  with  a  complete  absence  of  statistical  assumptions  on  the  nature  of  the  observed  data. 

The  problem  of  sequentially  choosing  paths  to  minimize  the  regret  (the  difference  between  cumulative 
lengths  of  the  paths  chosen  by  our  algorithm  and  the  total  length  of  the  longest  path  after  T  rounds)  is  known 
as  an  instance  of  bandit  online  linear  optimization.  The  “bandit”  part  of  the  name  is  due  to  the  connection 
with the multi-armed bandit problem, where only the payoff of the chosen “arm” (path) is revealed. The
basic  “bandit”  problem  was  put  forth  by  Robbins  [21]  in  1952  and  has  been  well-understood  since  then. 
The  recent  progress  comes  from  the  realization  that  well-performing  algorithms  can  be  found  (a)  for  large 
decision  spaces,  such  as  paths  in  a  graph,  and  (b)  under  adversarial  conditions  rather  than  the  stochastic 
formulation of Robbins. We are the first to bring these results to bear on the problem of quantitative analysis
of  embedded  software. 
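
To make the notion of regret concrete, the following minimal sketch computes it for a toy instance. The three path names, the per-round length table, and the helper `regret` are hypothetical and chosen only for illustration; regret is taken here as the total length of the best single path in hindsight minus the cumulative length collected by the player.

```python
def regret(lengths, chosen):
    """Best fixed path in hindsight minus the player's cumulative length.

    lengths[t][p] is the length of path p in round t (hypothetical data);
    chosen[t] is the path the player picked in round t.
    """
    player_total = sum(row[p] for row, p in zip(lengths, chosen))
    best_fixed_total = max(sum(row[p] for row in lengths) for p in lengths[0])
    return best_fixed_total - player_total


lengths = [
    {"a": 3, "b": 5, "c": 4},
    {"a": 6, "b": 5, "c": 2},
    {"a": 4, "b": 5, "c": 7},
]
chosen = ["a", "c", "b"]          # the player's picks over three rounds
print(regret(lengths, chosen))    # 15 - 10 = 5
```

In the bandit setting the player would only observe the entries `lengths[t][chosen[t]]`; the full table is used here solely to evaluate the regret after the fact.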

We  refer  the  reader  to  a  recent  book  [4]  for  a  comprehensive  treatment  of  sequential  prediction.  Some 
relevant results can be found in [17, 9, 1].

2.3  Miscellaneous 

Our  algorithm  uses  the  concept  of  basis  paths  of  a  program,  which  has  been  explored  before  in  the  software 
engineering  community  to  compute  the  cyclomatic  complexity  of  a  program  [16];  however,  our  theoretical 
results rely on extracting a special basis called a barycentric spanner [1]. Our approach relies heavily on advances
in SMT solving for input test generation; these techniques are surveyed in a recent book chapter [2].

3  Theoretical  Formulation  and  Overview 

We  are  concerned  with  estimating  a  physical  property  of  a  software  task  (program)  executing  in  its  target 
platform  (environment).  The  physical  quantity  of  interest  is  in  general  a  function  of  three  things:  the  program 
code,  parameters  of  its  environment,  and  the  inputs  to  the  program.  More  concisely,  we  can  express  the 
physical  quantity  q  as  the  following  function 


$$q = f_P(x, w)$$

where $x$ denotes the inputs to the program (such as data read from memory or received over the network), $w$
denotes the environment parameters (such as the contents of the cache or network delays), and $f_P$ denotes
the program-specific function that maps $x$ and $w$ to a value of the physical quantity.

In general, $x$ and $w$ vary over time, and so does $q$. However, the function $f_P$ is typically constant over
time. We will make the variation with time explicit by adding a subscript:

$$q_t = f_P(x_t, w_t)$$


Some sample physical properties of interest are as follows:


• Global worst-case estimation: In this case, we want to estimate the largest value of the quantity $q$ for
all values of $x$ and $w$:

$$\max_{x, w} f_P(x, w) \tag{1}$$

• Worst-case estimation over a time horizon $\tau$: This is a similar problem to the one above, except that the worst
case is to be computed over a finite time horizon $\tau$, formally specified as follows:

$$\max_{t = 1, \ldots, \tau} \; \max_{x_t, w_t} f_P(x_t, w_t) \tag{2}$$


• Average-case estimation over a time horizon against a worst-case environment: In this case, we want
to estimate, for a time horizon $\tau$ and for any sequence of environment parameters $w_1, w_2, \ldots, w_\tau$, the
following quantity:

$$\max_{x_1, \ldots, x_\tau} \frac{1}{\tau} \sum_{t=1}^{\tau} f_P(x_t, w_t) \tag{3}$$


• Can the system consume $R$ resources at any point over a time horizon of $\tau$: The question we ask here is
whether $q_t$ exceeds $R$ for any choice of $t$, $x_t$, and $w_t$. For example, a concrete instance of this problem
is to ask whether a software task can take more than $R$ seconds to execute.


For concreteness, in the remainder of this section, we will focus on a single quantity, execution time,
and  on  a  single  representative  problem,  namely,  the  worst-case  execution  time  (WCET)  estimation  problem. 
However,  our  theoretical  formulation  and  algorithms  carry  over  to  estimating  any  physical  quantity  and  to 
problems  other  than  worst-case  analysis. 

The  WCET  estimation  problem  can  be  defined  as  follows: 

Given  a  terminating  software  task  S  and  a  platform  M  on  which  S  executes,  estimate  the  longest 
time  S  takes  to  terminate  on  M. 


Moreover, we will focus on WCET estimation over a finite time horizon $\tau$. If we let $\tau$ go to $\infty$, this
problem reduces to the true WCET estimation problem. For brevity, we will simply refer to finite-horizon
WCET  estimation  as  WCET  estimation;  however,  our  experimental  results  compare  against  techniques  for 
the  true  WCET  estimation  problem. 

The  main  ideas  in  our  theoretical  formulation  are  elaborated  below. 

Game-theoretic formulation: We model the WCET estimation problem as a game between the WCET
estimation tool $\mathcal{T}$ and the environment $\mathcal{E}$ of $S$.

The game proceeds over multiple rounds, $t = 1, 2, 3, \ldots$. In each round, $\mathcal{T}$ picks the inputs $x$ to $S$.
These inputs determine the path taken through the program. $\mathcal{E}$ picks, in a potentially adversarial fashion,
environment parameters $w$. This choice by $\mathcal{E}$ can depend on the inputs selected by $\mathcal{T}$.

At the end of each round $t$, $\mathcal{T}$ receives as feedback the execution time $l_t$ of $S$ for its chosen path under
the parameters chosen by $\mathcal{E}$. Note that we assume that $\mathcal{T}$ only receives the overall execution time of the
task, not a more fine-grained measurement of (say) each basic block in the task along the chosen path. This
enables us to minimize any skew from instrumentation inserted to measure time. Based on the feedback $l_t$,
$\mathcal{T}$ can modify its input-selection strategy.

After some number of rounds $\tau$, we stop: $\mathcal{T}$ must output its prediction of the longest execution time of
$S$ that could have been exhibited during rounds $t = 1, 2, \ldots, \tau$. $\mathcal{T}$ wins the game if its prediction is correct;
otherwise, $\mathcal{E}$ wins. The goal of $\mathcal{T}$ is thus to select a sequence of inputs so that it can accumulate enough
data to identify, with high probability, the longest execution time of $S$ during $t = 1, 2, \ldots, \tau$.

Note that this longest execution time need not be due to inputs that have already been tried out by $\mathcal{T}$.

By permitting $\mathcal{E}$ to select environment parameters based on $\mathcal{T}$’s choice of path, we can model path-dependent
timing as well as perturbation in execution time of a single path due to variation in environmental
conditions or measurement error. The more predictable the timing behavior of the platform, the smaller
this perturbation will be. For theoretical analysis, we model the perturbation as a random variable whose
mean is bounded by a parameter $\mu_{\max}$. If a platform has predictable timing, such as the PRET processor
proposed by Edwards and Lee [6], it would mean that $\mu_{\max}$ is small. (The $\mu_{\max}$ parameter will play a role in
determining the rate of convergence of our proposed algorithm.)

Formulation as a graph problem: An additional aspect of our model is that the game operates on the
control-flow graph $G_S$ of the task $S$, with loops unrolled to a pre-determined safe upper bound.

In this setting, the game described above works as follows. At any round $t$, the player $\mathcal{T}$ selects a path $x_t$
through the graph $G_S$ from a designated source node (entry point of the function) to a designated sink node
(exit point/return statement of the function). This is performed by generating input values for $S$ to drive
execution down path $x_t$, using standard constraint-based test generation techniques based on SMT solvers. $\mathcal{E}$
selects lengths for all source-sink paths in $G_S$, where this selection can depend on the choice of $x_t$. However,
$\mathcal{E}$ only reveals the length $l_t$ of the chosen path $x_t$.

The goal of $\mathcal{T}$ is thus to select paths so that within a time horizon $\tau$ it can accumulate enough data to
identify, with high probability, the longest path in $G_S$ during rounds $t = 1, 2, \ldots, \tau$.

Next,  we  formalize  the  above  problem  definition. 

3.1  Theoretical  Formulation 

Consider a directed acyclic graph $G = (V, E)$ derived from the control-flow graph of the task with all loops
unrolled. We will assume that there is a single source node $u$ and single sink node $v$ in $G$; if not, then dummy
source and sink nodes can be added.

Let $\mathcal{P}$ denote the set of all paths in $G$ from source $u$ to sink $v$. We can associate each of the paths with
a binary vector with $m = |E|$ components, depending on whether each edge is present or not. In other words,
each source-sink path is a vector $x$ in $\{0, 1\}^m$, where the $i$th entry of the vector for a path $x$ corresponds to
edge $i$ of $G$, and is 1 if edge $i$ is in $x$ and 0 otherwise. The set $\mathcal{P}$ is thus a subset of $\{0, 1\}^m$.
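
The encoding of a path as a 0/1 vector over the edges can be sketched as follows. The diamond-shaped graph, the edge ordering, and the helper `path_to_vector` are assumptions introduced purely for illustration.

```python
# Hypothetical diamond-shaped DAG; the order of this list fixes the vector coordinates.
edges = [("u", "a"), ("u", "b"), ("a", "v"), ("b", "v")]


def path_to_vector(node_path, edges):
    """Return the 0/1 edge-incidence vector of a path given as a node sequence."""
    used = set(zip(node_path, node_path[1:]))
    unknown = used - set(edges)
    if unknown:
        raise ValueError(f"not edges of the graph: {sorted(unknown)}")
    return [1 if e in used else 0 for e in edges]


x_upper = path_to_vector(["u", "a", "v"], edges)
x_lower = path_to_vector(["u", "b", "v"], edges)
print(x_upper)  # [1, 0, 1, 0]
print(x_lower)  # [0, 1, 0, 1]
```

With this representation, the length of a path under edge weights is simply the dot product of its vector with the weight vector.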

The path prediction interaction is modeled as a repeated game between our algorithm ($\mathcal{T}$) and the program
environment ($\mathcal{E}$). On each round $t$, $\mathcal{T}$ chooses a path $x_t \in \mathcal{P}$ between $u$ and $v$. Concurrent with this
choice, the adversary $\mathcal{E}$ picks a table of non-negative path lengths given by the function $L_t : \mathcal{P} \to \mathbb{R}_{\geq 0}$.
Then, the total length $l_t$ of the chosen path $x_t$ is revealed, where $l_t = L_t(x_t)$. The game proceeds for some
number of rounds $t = 1, 2, \ldots, \tau$.
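
The protocol of the repeated game can be sketched as a simple loop. This is only an illustration of the interaction and its bandit feedback, not the estimation algorithm of Section 4; the function names, the uniformly random player, and the alternating adversary are all hypothetical.

```python
import random


def play_game(paths, choose_path, pick_table, rounds, seed=0):
    """Run the repeated game; the player only ever sees (x_t, l_t) pairs."""
    rng = random.Random(seed)
    history = []  # bandit feedback collected so far
    for t in range(1, rounds + 1):
        x_t = choose_path(paths, history, rng)  # player's move
        L_t = pick_table(paths, t)              # adversary's full table, kept hidden
        history.append((x_t, L_t[x_t]))         # only l_t = L_t(x_t) is revealed
    return history


def uniform_player(paths, history, rng):
    """Placeholder strategy: ignore the feedback and pick a path at random."""
    return rng.choice(paths)


def alternating_adversary(paths, t):
    """Hypothetical environment: path "p1" is slow on odd rounds only."""
    return {"p1": 12 if t % 2 else 9, "p2": 10}


paths = ["p1", "p2"]
history = play_game(paths, uniform_player, alternating_adversary, rounds=6)
for t, (x_t, l_t) in enumerate(history, start=1):
    print(t, x_t, l_t)
```

A learning strategy would replace `uniform_player` and use `history` to decide which path to test next.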

At the end of round $\tau$, the goal of $\mathcal{T}$ is to accurately estimate the worst-case execution time due to
environment states in rounds $t = 1, 2, \ldots, \tau$. This can be expressed as the following quantity:

$$L_{\max} = \max_{x \in \mathcal{P}} \; \max_{t = 1, 2, \ldots, \tau} L_t(x) \tag{4}$$

Moreover, we would also like $\mathcal{T}$ to identify the worst-case path given by

$$x^* = \arg\max_{x \in \mathcal{P}} \; \max_{t = 1, 2, \ldots, \tau} L_t(x) \tag{5}$$
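
As a reference point, the two target quantities can be computed by brute force when the full tables of path lengths are available. The sketch below assumes exactly that, which the player in the game never has; the table values and the helper `worst_case` are hypothetical.

```python
def worst_case(tables):
    """Return (L_max, x_star) from full tables, where tables[t][x] is L_t(x)."""
    paths = tables[0].keys()
    # For each path, take its largest length over all rounds.
    per_path_max = {x: max(L_t[x] for L_t in tables) for x in paths}
    x_star = max(per_path_max, key=per_path_max.get)
    return per_path_max[x_star], x_star


tables = [
    {"p1": 10, "p2": 7},
    {"p1": 9, "p2": 12},
    {"p1": 11, "p2": 8},
]
L_max, x_star = worst_case(tables)
print(L_max, x_star)  # 12 p2
```

The difficulty addressed in this paper is estimating these values when only one entry of each table is revealed per round.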

We  make  a  few  remarks  on  the  above  theoretical  model. 

First, we stress that, in the above formulation, the goal is to find the WCET due to environment states in
rounds $t = 1, 2, \ldots, \tau$. In order to find the true WCET, for all possible environment states, we need to assume
that the worst-case state occurs at some time between $t = 1$ and $t = \tau$. We contend that this formulation is
useful in spite of this assumption because it serves to decouple the path dimension of the WCET estimation
problem from the state dimension. In our experience, for many applications, the worst-case environment
state does appear at some time during testing; the problem is that testing may not pick the worst-case path
at that same time. With our formulation, the goal is to accurately estimate the WCET even if we do not
sample the worst-case path when the worst-case state occurred.

Second, the definition of our estimation target $L_{\max}$ assumes that the timing of a program depends only
on  the  control  flow  through  that  program.  In  general,  the  timing  can  also  depend  on  characteristics  of  input 
data  that  do  not  influence  control  flow.  We  believe  that  the  basic  framework  we  describe  here  also  applies  to 
the  case  of  data-dependent  timing,  and  leave  an  exploration  of  this  aspect  to  future  work. 

Overall,  we  believe  that  decoupling  the  path  problem  from  the  state  problem  in  a  manner  that  can  be 
applied  easily  to  any  platform  is  in  itself  a  significant  challenge.  This  paper  mainly  focuses  on  solving  this 
problem.  In  future  work,  we  plan  to  address  the  limitations  of  the  model  identified  above. 

The third and final remark we make is about the “size” of the theoretical model. Since a DAG can have
exponentially many paths in the number of nodes and edges, the domain of the function $L_t$ is potentially
exponential, and $L_t$ can change at each round $t$. In the worst case, the strategy sets of both $\mathcal{T}$ and $\mathcal{E}$ in this
model are exponential-sized, and it is impossible to exactly learn $L_t$ for every $t$ without sampling all paths.
Hence,  we  need  to  approximate  the  above  model  with  another  model  that,  while  being  more  compact,  retains 
enough  accuracy  to  generate  useful  results  in  practice. 

Below,  we  present  a  more  compact  model,  which  our  algorithm  is  then  based  upon.  We  will  present  this 
model  in  two  steps. 

3.1.1  Modeling  with  Weights  and  Perturbation 

We model the selection of the table of lengths $L_t$ by the environment $\mathcal{E}$ as a two-step procedure.

(i) First, $\mathcal{E}$ chooses a vector of non-negative edge weights, $w_t \in \mathbb{R}^m_{\geq 0}$, for $G$. These weights represent
path-independent delays of basic blocks in the program.

(ii) Then, after observing the path $x_t$ selected by $\mathcal{T}$, $\mathcal{E}$ picks a distribution from which it draws a perturbation
vector $\pi_t(x_t)$. The functional notation indicates that the distribution is a function of $x_t$.

The vector $\pi_t(x_t)$ models the path-specific changes that $\mathcal{E}$ applies to its original choice $w_t$. We will
abbreviate $\pi_t(x_t)$ by $\pi^x_t$. In cases where we wish to denote $\pi_t(x')$ for $x'$ that could be different from
$x_t$, we will explicitly write $\pi^x_t(x')$ or $\pi_t(x')$.

The only restriction we place on $\pi_t(x)$, for any $x$, is that $\|\pi_t(x)\|_1 \leq N$, for some finite $N$. The parameter
$N$ is arbitrary, but places the constraint that the perturbation of any path length cannot be unbounded.
Thus, the overall path length observed by $\mathcal{T}$ is

$$l_t = x_t \cdot (w_t + \pi^x_t) = x_t^\top (w_t + \pi^x_t)$$

Now  let  us  consider  how  this  model  relates  to  the  original  formulation  we  started  with. 

First, note that, in the original model, $\mathcal{E}$ picks the function $L_t$ that defines the lengths of all paths. To
relate to that model, here we can assume, without loss of generality, that $\mathcal{E}$ draws a priori the perturbation
vectors $\pi_t(x)$ for all $x \in \mathcal{P}$, but only $\pi_t(x_t)$ plays a role in determining $l_t$.

Second, equating the observed lengths, we see that

$$L_t(x_t) = x_t^\top (w_t + \pi^x_t)$$

The main constraint on this equation is the requirement that $\|\pi^x_t\|_1 \leq N$, which implies that $|x_t^\top \pi^x_t| \leq N$.

In effect, by using this model we require that $\mathcal{E}$ pick $L_t$ by first selecting path-independent weights $w_t$ and
then, for each source-sink path, modifying its length by a perturbation of at most $\pm N$. Note, however, that
the model places absolutely no restrictions on the value of $w_t$ or how it changes with $t$ (from round to round).
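
The weights-plus-perturbation model can be illustrated numerically. In the sketch below, the path vector, the weights, the bound `N`, and the particular way of drawing a perturbation with 1-norm at most `N` are all assumptions made for illustration; the model itself only requires that the 1-norm bound holds.

```python
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def draw_perturbation(m, N, rng):
    """Draw one vector whose 1-norm is at most N (an arbitrary bounded distribution)."""
    raw = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    norm1 = sum(abs(r) for r in raw) or 1.0
    scale = N * rng.random() / norm1  # resulting 1-norm is N * u with u in [0, 1)
    return [r * scale for r in raw]


def observed_length(x_t, w_t, pi_t):
    """Path length seen by the player: x_t . (w_t + pi_t)."""
    return dot(x_t, [w + p for w, p in zip(w_t, pi_t)])


rng = random.Random(1)
x_t = [1, 0, 1, 0]            # hypothetical path using edges 1 and 3
w_t = [5.0, 2.0, 3.0, 4.0]    # hypothetical path-independent edge weights
N = 0.5
pi_t = draw_perturbation(len(w_t), N, rng)
l_t = observed_length(x_t, w_t, pi_t)
nominal = dot(x_t, w_t)       # 8.0, the length without perturbation
print(nominal, l_t, abs(l_t - nominal) <= N)
```

Because the entries of the path vector are 0 or 1, the deviation of the observed length from the nominal one is bounded by the 1-norm of the perturbation, and hence by `N`.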

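The goal for $\mathcal{T}$ in this model is to estimate the following quantity

$$L_{\max} = \max_{x \in \mathcal{P}} \; \max_{t = 1, 2, \ldots, \tau} x^\top (w_t + \pi^x_t) \tag{6}$$

Moreover, we would also like $\mathcal{T}$ to identify the worst-case path given by

$$x^* = \arg\max_{x \in \mathcal{P}} \; \max_{t = 1, 2, \ldots, \tau} x^\top (w_t + \pi^x_t) \tag{7}$$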

3.1.2  Simplified  Model  without  Perturbation 

To  more  easily  introduce  the  key  concepts  in  our  algorithm,  we  will  initially  assume  that  the  perturbation 
vectors at all time points are identically 0, viz., $\pi_t(x) = 0$ for all $t$ and $x$.

Clearly, this is an unrealistic idealization in practice, since in this model the length of an edge is independent
of the path it lies on. We stress that our main theoretical results are for the more realistic model
defined  in  Section  3.1.1. 

We  next  give  an  overview  of  our  approach  in  the  context  of  a  small  example. 

3.2  Overview  of  Our  Approach 

We describe the working of our approach using a small program from an actual real-time embedded system,
the Paparazzi unmanned aerial vehicle (UAV) [18]. Figure 1 shows the C source code for the altitude_control_task
in the Paparazzi code, which is publicly available as open source.

Starting with the source code for a task, and all the libraries and other definitions it relies on, we run the task through a C preprocessor and the CIL front-end [8] and extract the control-flow graph (CFG). In this graph, each node corresponds to the start of a basic block and edges are labeled with the basic block code or conditional statements that govern control flow. Figure 2 shows the CFG for the code shown in Figure 1.

Note that we assume that code terminates, and bounds are known on all loops. Thus, we start with code with all loops (if any) unrolled, and the CFG is thus a directed acyclic graph (DAG). We also pre-process the CFG so that it has exactly one source and one sink. Each execution through the program is a source-to-sink path in the CFG.

An  exhaustive  approach  to  program  path  analysis  will  need  to  enumerate  all  paths  in  this  DAG.  However, 
it  is  well-known  that  a  DAG  can  have  exponentially  many  paths  (in  the  number  of  vertices/edges).  Thus,  a 
brute-force  enumeration  of  paths  is  not  efficient. 
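The exponential blow-up is easy to reproduce. The following sketch counts source-to-sink paths in a chain of k if-then-else "diamonds"; the graph construction and node names are hypothetical and only serve to illustrate the claim.

```python
def count_paths(successors, source, sink):
    """Count source-to-sink paths in a DAG by memoized depth-first search."""
    memo = {}

    def visit(node):
        if node == sink:
            return 1
        if node not in memo:
            memo[node] = sum(visit(nxt) for nxt in successors.get(node, ()))
        return memo[node]

    return visit(source)

def diamond_chain(k):
    """Build a chain of k if-then-else diamonds (hypothetical node names)."""
    successors = {}
    for d in range(k):
        top, then_, else_, bottom = (d, "top"), (d, "then"), (d, "else"), (d + 1, "top")
        successors[top] = [then_, else_]
        successors[then_] = [bottom]
        successors[else_] = [bottom]
    return successors, (0, "top"), (k, "top")

# A chain of k diamonds has 4k edges but 2**k source-to-sink paths.
paths_for_10 = count_paths(*diamond_chain(10))
paths_for_30 = count_paths(*diamond_chain(30))
```

For this particular family the path vectors span a space of dimension only k + 1 (one all-"then" path plus one path per diamond that takes the "else" branch), which is the gap that a basis-path approach exploits.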


#define PPRZ_MODE_AUTO2 2
#define PPRZ_MODE_HOME 3
#define VERTICAL_MODE_AUTO_ALT 3
#define CLIMB_MAX 1.0

void altitude_control_task(void) {
  if (pprz_mode == PPRZ_MODE_AUTO2
      || pprz_mode == PPRZ_MODE_HOME) {
    if (vertical_mode == VERTICAL_MODE_AUTO_ALT) {
      /* inlined below: function altitude_pid_run(); */
      float err = estimator_z - desired_altitude;
      desired_climb = pre_climb + altitude_pgain * err;
      if (desired_climb < -CLIMB_MAX)
        desired_climb = -CLIMB_MAX;
      if (desired_climb > CLIMB_MAX)
        desired_climb = CLIMB_MAX;
    }
  }
}


Figure  1:  Source  code  for  altitude_control_task 


x1 = (1,1,1,0,0,1,1,0,0,1)
x2 = (1,0,0,1,1,0,0,1,1,1)
x3 = (1,1,1,0,0,0,0,1,1,1)
x4 = (1,0,0,1,1,1,1,0,0,1)

x4 = x1 + x2 - x3


Our approach is to sample a set of basis paths. Recall that each source-sink path can be viewed as a vector in $\{0,1\}^m$, where $m$ is the number of edges in the unrolled CFG. The set of all valid source-sink paths thus forms a subset $\mathcal{P}$ of $\{0,1\}^m$. We compute a basis for $\mathcal{P}$ in which each element of the basis is also a source-sink path.

Figure 3 illustrates the ideas using a simple “2-diamond” example of a CFG. In this example, paths $x_1$, $x_2$ and $x_3$ form a basis and $x_4$ can be expressed as the linear combination $x_1 + x_2 - x_3$.
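The linear relation among the four paths of the 2-diamond example can be checked directly. In the sketch below the path vectors are those of Figure 3, while the edge delays `w` are hypothetical values chosen only for illustration. When edge lengths do not depend on the path, the same linear combination predicts the length of the unmeasured path from the lengths of the three basis paths.

```python
# Path vectors of the 2-diamond CFG (one entry per edge).
x1 = (1, 1, 1, 0, 0, 1, 1, 0, 0, 1)
x2 = (1, 0, 0, 1, 1, 0, 0, 1, 1, 1)
x3 = (1, 1, 1, 0, 0, 0, 0, 1, 1, 1)
x4 = (1, 0, 0, 1, 1, 1, 1, 0, 0, 1)

# x4 as a linear combination of the basis paths.
combo = tuple(a + b - c for a, b, c in zip(x1, x2, x3))

# Hypothetical per-edge delays (illustrative values, not measurements).
w = (2, 5, 1, 3, 4, 7, 2, 1, 6, 3)

def length(path, weights):
    """Length of a path: the inner product of its vector with the weights."""
    return sum(p * wt for p, wt in zip(path, weights))

# Predict the length of x4 from the three measured basis-path lengths.
predicted_x4 = length(x1, w) + length(x2, w) - length(x3, w)
```

Here `combo` equals `x4`, and `predicted_x4` matches `length(x4, w)` without x4 ever being measured.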

Our algorithm, described in detail in Section 4, randomly samples basis paths of the CFG and drives program execution down those paths by generating tests using constraint solving. From the observed lengths of those paths, we estimate edge weights on the entire graph. This estimate, accumulated over several rounds of the game, is then used to predict the longest source-sink path in the CFG. Theoretical guarantees on performance are proved in Section 4 and experimental evidence for its utility is given in Section 5.


Figure  3:  Illustration  of  Basis  Paths.  An  edge 
label  indicates  the  position  for  that  edge  in  the  vector 
representation  of  a  path. 


4  Algorithm  and  Theoretical  Results 

Recall that, in the model introduced in the previous section, the path prediction interaction is modeled as a repeated game between our algorithm (Player) and the program environment (Adversary). On each round $t$, we choose a source-sink path $x_t \in \mathcal{P} \subseteq \{0,1\}^m$. The adversary chooses the lengths of paths in the graph. We assume that this choice is made by the following two-stage process: first, the adversary chooses the worst-case weights, $w_t \in \mathbb{R}^m$, on the edges of $G$ independently of our choice $x_t$, and then skews these weights by adding a random perturbation $\pi_t^x$, whose distribution depends on $x_t$. (We will also refer to edge weights and path lengths as “delays”, to make concrete the link to timing analysis.)

Figure 2: Control-flow graph for altitude_control_task

In the simplified model, which we consider first, we suppose that the perturbation is zero; thus, we observe the overall path length $l_t = x_t^T w_t$. In the general model, only $l_t = x_t^T (w_t + \pi_t^x)$ is observed. No other information is provided to us; not only do we not know the lengths of the paths not chosen, we do not even know the contributions of particular edges on the chosen path. It is important to emphasize that in the general model we assume that the adversary is adaptive in that $w_t$ and $\pi_t^x$ can depend on the past history of choices by the player and the adversary.

Suppose that there is a single fixed path $x^*$ which is the longest one on each round. One possible objective is to find $x^*$. In the following, we exhibit an efficient randomized algorithm which allows us to find it correctly with high probability. In fact, our results are more general: if no single longest path exists, we can provably find a batch of longest paths. We describe later how our theoretical approach paves the way for analyzing worst-case execution time under a range of assumptions.

Before diving into the details of the algorithm, let us sketch how it works:

•  First, compute a representative set of basis paths, called a barycentric spanner (see Section 4.1).

•  For a specified number of iterations $\tau$, do the following:

   *  pick a path from the representative set

   *  observe its length

   *  construct an estimate of edge weights on the whole graph from the observed length

•  Find the longest path or a set of longest paths based on the average of the estimates over $\tau$ iterations.

It might seem mysterious that we can reconstruct edge weights (delays, for the case of timing analysis) on the whole graph based on a single number, which is the total length of the path we chose. To achieve this, our method exploits the power of randomization and a careful choice of a representative set of paths. The latter choice is discussed next.

4.1  Focusing  on  a  Barycentric  Spanner 

It is well-known in the game-theoretic study of path prediction that any deterministic strategy against an adaptive adversary will fail [4]. Therefore, the algorithm we present below is randomized. As we only observe the entire length of the path we choose, we are bound to select from a set of paths covering the whole graph, or else we risk missing a highly time-consuming edge. However, simply covering the graph is not enough: such coverage corresponds to “statement coverage” in the program, without covering all ways of getting to a statement. Indeed, a key feature of the algorithm is the ability to exploit correlations between paths to guarantee that we find the longest. Hence, we need a barycentric spanner (introduced by [1]), a set of up to $m$ paths with two valuable properties: any path in the graph can be written as a linear combination of the paths in the spanner, and the coefficients in this linear combination are bounded in absolute value. The first requirement says that the spanner is a good representation for the exponentially large set of possible paths; the second says that lengths of some of the paths in the spanner will be of the same order of magnitude as the length of the longest path. These properties enable us to repeatedly sample from the barycentric spanner and reconstruct delays on the whole graph. We then employ concentration inequalities¹ to prove that these reconstructions, on average, converge to the true delays of the paths. Once we have a good statistical estimate of the true weights on all the edges, it only remains to run a longest-path algorithm for weighted directed acyclic graphs (longest-path), subject to path feasibility constraints.
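The final longest-path step can be sketched as a dynamic program over the DAG. The version below is a minimal illustration that ignores path feasibility constraints (which, in practice, would require a constraint or integer-programming formulation); the graph, node names, and weights are hypothetical.

```python
from functools import lru_cache

def longest_path(edges, source, sink):
    """Return (length, path) of a longest source-to-sink path in a DAG.

    `edges` maps a node to a list of (successor, weight) pairs.
    Feasibility of the path in the program is NOT checked here.
    """
    @lru_cache(maxsize=None)
    def best_from(node):
        if node == sink:
            return (0.0, (sink,))
        candidates = []
        for nxt, weight in edges.get(node, ()):
            sub = best_from(nxt)
            if sub is not None:            # skip dead ends that never reach the sink
                candidates.append((weight + sub[0], (node,) + sub[1]))
        return max(candidates) if candidates else None

    return best_from(source)

# Hypothetical estimated edge weights on a tiny diamond-shaped CFG.
edges = {
    "s": [("a", 2.0), ("b", 5.0)],
    "a": [("t", 4.0)],
    "b": [("t", 0.5)],
}
best_length, best_path = longest_path(edges, "s", "t")
```

Memoizing on the node keeps the running time linear in the number of edges, in contrast to enumerating all paths.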

The existence of a barycentric spanner has been shown in Awerbuch and Kleinberg [1]. In particular, the authors provide the following procedure to find a 2-barycentric spanner set (where coefficients are bounded in absolute value by 2) $\{b_1, \ldots, b_m\} \subseteq \mathcal{P}$ (see also [17]).

In Algorithm 1, $B = (b_1, \ldots, b_m)$ and $B_{-i} = (b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_m)$. The output of the algorithm is the final value of $B$, a 2-barycentric spanner; i.e., any path $x \in \mathcal{P}$ can be written as $x = \sum_{i=1}^{m} \alpha_i b_i$ with $|\alpha_i| \le 2$. The $i$th iteration of the for-loop in lines 2-4 replaces the $i$th element of the standard basis with a path that is orthogonal to the previous $i - 1$ paths identified so far and with all remaining standard basis vectors and also spans the path-space $\mathcal{P}$. Line 3 of the algorithm corresponds to maximizing a linear function over the set $\mathcal{P}$, and can be solved using longest-path.² At the end of the for-loop, we are left with a basis of $\mathcal{P}$ that is not necessarily a 2-barycentric spanner. Lines 5-7 of the algorithm refine this basis into a 2-barycentric spanner using the same longest-path optimization oracle that is used in the for-loop.

¹ Concentration inequalities are sharp probabilistic guarantees on the deviation of a function of random variables from its mean.

² In practice, to compute feasible basis paths one must add constraints that rule out infeasible paths, as is standard in integer programming formulations for timing analysis [14]; in this case, the longest-path computation is solved as an integer linear program.

Algorithm 1 Finding a 2-Barycentric Spanner

1: $(b_1, \ldots, b_m) \leftarrow (e_1, \ldots, e_m)$
2: for $i = 1$ to $m$ do {Compute a basis of $\mathcal{P}$}
3:    $b_i \leftarrow \arg\max_{x \in \mathcal{P}} |\det(x, B_{-i})|$
4: end for
5: while $\exists\, x \in \mathcal{P},\ i \in \{1, \ldots, m\}$ satisfying $|\det(x, B_{-i})| > 2\,|\det(b_i, B_{-i})|$ do {Transform $B$ into a 2-barycentric spanner}
6:    $b_i \leftarrow x$
7: end while

Algorithm 2 GameTime with simplified environment model

1: Input $\tau \in \mathbb{N}$
2: Compute a 2-barycentric spanner $\{b_1, \ldots, b_b\}$
3: for $t = 1$ to $\tau$ do
4:    Environment chooses $w_t$.
5:    We choose $i_t \in \{1, \ldots, b\}$ uniformly at random.
6:    We predict the path $x_t = b_{i_t}$ and observe the path length $l_t = b_{i_t}^T w_t$
7:    Estimate $\tilde v_t \in \mathbb{R}^b$ as $\tilde v_t = b\, l_t \cdot e_{i_t}$, where $\{e_i\}$ denotes the standard basis.
8:    Compute estimated weights $\tilde w_t = B^+ \tilde v_t$
9: end for
10: Use the obtained sequence $\tilde w_1, \ldots, \tilde w_\tau$ to find a longest path(s). For example, for Theorem 4.2, we compute $\tilde x_\tau^* := \arg\max_{x \in \mathcal{P}} x^T \sum_{t=1}^{\tau} \tilde w_t$.
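The estimate built in line 7 looks crude, since it places b times the observed length in a single coordinate, but it is correct on average. The sketch below checks this for one round by enumerating the uniform choice of the basis index exactly instead of sampling. It reuses the basis paths of Figure 3; the edge delays `w` are hypothetical, and exact rational arithmetic is used only to avoid floating-point noise.

```python
from fractions import Fraction

# Basis paths of the 2-diamond CFG of Figure 3 (the rows of B).
B = [
    (1, 1, 1, 0, 0, 1, 1, 0, 0, 1),  # x1
    (1, 0, 0, 1, 1, 0, 0, 1, 1, 1),  # x2
    (1, 1, 1, 0, 0, 0, 0, 1, 1, 1),  # x3
]
b = len(B)

# Hypothetical edge delays for one round (illustrative values only).
w = (2, 5, 1, 3, 4, 7, 2, 1, 6, 3)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def estimate(i):
    """One-round estimate: b * (observed length of basis path i) * e_i."""
    observed = dot(B[i], w)                      # the only observed quantity
    return [Fraction(b * observed) if j == i else Fraction(0) for j in range(b)]

# Exact expectation over the uniform choice of i (probability 1/b each).
expected_v = [sum(estimate(i)[j] for i in range(b)) / b for j in range(b)]
true_v = [dot(row, w) for row in B]              # v = B w

# Any path x = alpha^T B inherits an unbiased length estimate alpha^T v.
alpha = (1, 1, -1)                               # x4 = x1 + x2 - x3
x4 = (1, 0, 0, 1, 1, 1, 1, 0, 0, 1)
expected_len_x4 = dot(alpha, expected_v)
```

A single round's estimate is far from `true_v`; only its expectation matches, which is why the algorithm averages over many rounds and why the analysis relies on concentration inequalities.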


One  can  intuitively  view  the  determinant  computation  as  computing  the  volume  of  the  corresponding 
polytope.  Maximizing  the  determinant  amounts  to  spreading  the  vertices  of  the  polytope  as  far  as  possible 
in  order  to  obtain  a  “diverse”  set  of  basis  paths. 

It is shown in [1] that the running time of Algorithm 1 is only quadratic in $m$. György et al. [9] extend the above procedure to the case where the set of paths spans only a $b$-dimensional subspace of $\mathbb{R}^m$ (where $b < m$), a scenario which is more realistic for our setting. Slightly abusing notation, let $B$ be the $b \times m$ matrix with the $b_i$'s as rows. We define the Moore-Penrose pseudo-inverse of $B$ as $B^+ = B^T (B B^T)^{-1}$. It holds that $B B^+ = I_b$. For theoretical analysis, let $M$ be any upper bound on the length of any basis path.

Since we have assumed an adaptive adversary that produces $w_t$ based on our previous choices $x_1, \ldots, x_{t-1}$ as well as the random factors $\pi_1^x, \ldots, \pi_{t-1}^x$, we should take care in dealing with expectations. To this end, let us denote the conditional expectation $E_t[A] = E[A \mid i_1, \ldots, i_{t-1}, \pi_1^x, \ldots, \pi_{t-1}^x]$, keeping in mind that randomness at time $t$ in the general model stems from our random choice $i_t$ of the basis path and the adversary’s random choice $\pi_t^x$ given $i_t$. In the simplified model, all randomness is due to our choice of the basis path, and this makes the analysis more transparent. We stress that the adversary can vary the distribution of $\pi_t^x$ according to the path chosen by the Player.

4.2  Analysis  under  the  Simplified  Model 

We  now  analyze  the  effectiveness  of  GameTime  under  the  simplified  model  presented  in  Section  3.  We 
begin  by  proving  some  key  properties  of  the  algorithm. 

Preliminaries 

The following Lemma is key to proving that Algorithm 2 performs well. It quantifies the deviations of our estimates of the delays on the whole graph, $\tilde w_t$, from the true delays $w_t$, which we cannot observe.

Lemma 4.1 With probability at least $1 - \delta$, for all $x \in \mathcal{P}$,

$$\left| \frac{1}{\tau} \sum_{t=1}^{\tau} (\tilde w_t - w_t)^T x \right| \le \tau^{-1/2}\, c\, \sqrt{2b + 2\ln(2\delta^{-1})}, \qquad (8)$$

where $c = 4bM$.

Proof: We will show that $E_t\, \tilde w_t^T x = w_t^T x$ for any $x \in \mathcal{P}$, i.e., the estimates are unbiased³ on the subspace spanned by $\{b_1, \ldots, b_b\}$. By working directly in the subspace, we obtain the required probabilistic statement and will have the dimensionality of the subspace $b$, not $m$, entering the bounds.

Define $v_t = B w_t$ just as $\tilde v_t = B \tilde w_t$. Taking expectations with respect to $i_t$, conditioned on $i_1, \ldots, i_{t-1}$,

$$E_t \tilde v_t = E_t\left[ b\,(b_{i_t}^T w_t) \cdot e_{i_t} \right] = \frac{1}{b} \sum_{i=1}^{b} b\,(b_i^T w_t) \cdot e_i = B w_t = v_t.$$

Fix any $\alpha \in \{-2, 2\}^b$. We claim that the sequence $Z_1, \ldots, Z_\tau$, where $Z_t = \alpha^T (\tilde v_t - v_t)$, is a bounded martingale difference sequence. Indeed, $E_t Z_t = 0$ by the previous argument. A bound on the range of the random variables $Z_t$ can be computed by observing

$$|\alpha^T \tilde v_t| = \left| \alpha^T \left[ b\,(b_{i_t}^T w_t)\, e_{i_t} \right] \right| \le 2b\, |b_{i_t}^T w_t| \le 2bM$$

and

$$|\alpha^T v_t| \le 2bM,$$

implying

$$|Z_t| \le 4bM = c.$$


An application of the Azuma-Hoeffding inequality (see Appendix) for a martingale difference sequence yields, for the fixed $\alpha$,

$$\Pr\left( \left| \sum_{t=1}^{\tau} Z_t \right| > c \sqrt{2\tau \ln\left(2\,(2^b)\,\delta^{-1}\right)} \right) \le \delta / 2^b.$$


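Having proved a statement for a fixed $\alpha$, we would like to apply the union bound⁴ to arrive at the corresponding statement for any $\alpha \in [-2, 2]^b$. This cannot be done directly, as the set is uncountable. However, applying a union bound over the vertices of the hypercube $\{-2, 2\}^b$ is enough. Indeed, if $\left|\sum_{t=1}^{\tau} Z_t\right| = \left|\alpha^T \sum_{t=1}^{\tau} (\tilde v_t - v_t)\right| \le \xi$ for all vertices of $\{-2, 2\}^b$, then immediately $\left|\sum_{t=1}^{\tau} Z_t\right| \le \xi$ for any $\alpha \in [-2, 2]^b$ by linearity. Thus, by the union bound,

$$\Pr\left( \forall \alpha \in [-2, 2]^b,\ \left| \alpha^T \sum_{t=1}^{\tau} (\tilde v_t - v_t) \right| \le c \sqrt{2\tau b + 2\tau \ln(2\delta^{-1})} \right) \ge 1 - \delta.$$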


Any path $x$ can be written as $x^T = \alpha^T B$ for some $\alpha \in [-2, 2]^b$. Furthermore, $\tilde w_t = B^+ \tilde v_t$ implies that $x^T \tilde w_t = \alpha^T B B^+ \tilde v_t = \alpha^T \tilde v_t$ and $x^T w_t = \alpha^T v_t$. We conclude that

$$\Pr\left( \forall x \in \mathcal{P},\ \left| \sum_{t=1}^{\tau} (\tilde w_t - w_t)^T x \right| \le c \sqrt{2\tau b + 2\tau \ln(2\delta^{-1})} \right) \ge 1 - \delta$$

and the statement follows by dividing by $\tau$.

□

Figure 4: Illustration of the second inclusion in Lemma 4.2. The set of $\varepsilon$-longest paths, the object of interest, is contained in the set of $(\varepsilon + 2\xi)$-longest paths w.r.t. the sequence $\tilde w_1, \ldots, \tilde w_\tau$. Under a margin assumption, equality between the two sets can be shown, as exhibited by Theorem 4.2. (The horizontal axis of the figure is path length.)

³ For random variables $X$ and $\tilde X$, $\tilde X$ is said to be an unbiased estimate of $X$ if $E[\tilde X - X] = 0$.

⁴ Also known as Boole’s inequality, the union bound says that the probability that at least one of a countable set of events happens is at most the sum of the probabilities of the events, e.g. $\Pr(A \cup B) \le \Pr(A) + \Pr(B)$.

Estimating  the  Set  of  Longest  Paths 

With the help of Lemma 4.1, we can now analyze how the longest (or almost-longest) paths with respect to the estimated $\tilde w_t$'s, where path lengths are averaged over all rounds, compare to the true averaged longest paths.


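Definition 4.1 Define the set of $\varepsilon$-longest paths with respect to the actual delays

$$S_\tau^{\varepsilon} = \left\{ x \in \mathcal{P} : \frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T x \ge \max_{x' \in \mathcal{P}} \frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T x' - \varepsilon \right\}$$

and with respect to the estimated delays

$$\tilde S_\tau^{\varepsilon} = \left\{ x \in \mathcal{P} : \frac{1}{\tau} \sum_{t=1}^{\tau} \tilde w_t^T x \ge \max_{x' \in \mathcal{P}} \frac{1}{\tau} \sum_{t=1}^{\tau} \tilde w_t^T x' - \varepsilon \right\}.$$

In particular, $S_\tau^{0}$ is the set of longest paths.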


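The following Lemma makes our intuition precise: with enough trials $\tau$, the set of longest paths, which we can calculate after running Algorithm 2, becomes almost identical to the true set of longest paths. We illustrate this point graphically in Figure 4: in a histogram of average path lengths, the set of longest paths (the right “bump”) is somewhat smoothed when considering the path lengths under the estimated $\tilde w_t$’s. In other words, paths might have a slightly different average path length under the estimated and actual weights. However, we can still guarantee that this smoothing becomes negligible for large enough $\tau$, enabling us to locate the longest paths.

Lemma 4.2 For any $\varepsilon \ge 0$ and for $\xi = \tau^{-1/2}\, 4bM \sqrt{2b + 2\ln(2\delta^{-1})}$,

$$\tilde S_\tau^{\varepsilon} \subseteq S_\tau^{\varepsilon + 2\xi} \quad \text{and} \quad S_\tau^{\varepsilon} \subseteq \tilde S_\tau^{\varepsilon + 2\xi}$$

with probability at least $1 - \delta$.

Proof: Let $x \in \tilde S_\tau^{\varepsilon}$ and $y \in S_\tau^{0}$. Suppose that we are in the $(1 - \delta)$-probability event of Lemma 4.1. Then

$$\begin{aligned}
\frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T x
&\ge \frac{1}{\tau} \sum_{t=1}^{\tau} \tilde w_t^T x - \xi \\
&\ge \max_{x' \in \mathcal{P}} \frac{1}{\tau} \sum_{t=1}^{\tau} \tilde w_t^T x' - \varepsilon - \xi \\
&\ge \frac{1}{\tau} \sum_{t=1}^{\tau} \tilde w_t^T y - \varepsilon - \xi \\
&\ge \frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T y - \varepsilon - 2\xi \\
&= \max_{x' \in \mathcal{P}} \frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T x' - \varepsilon - 2\xi,
\end{aligned}$$

where the first and fourth inequalities follow by Lemma 4.1, the third inequality is by definition of maximum, and the second and fifth are by definitions of $\tilde S_\tau^{\varepsilon}$ and $S_\tau^{0}$, respectively. Since the sequence of inequalities holds for any $x \in \tilde S_\tau^{\varepsilon}$, we conclude that $\tilde S_\tau^{\varepsilon} \subseteq S_\tau^{\varepsilon + 2\xi}$. The other direction of inclusion is proved analogously. □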

Note that $\xi \to 0$ as $\tau \to \infty$. To compute the set $S_\tau^{\varepsilon}$, we can instead compute the set $\tilde S_\tau^{\varepsilon + 2\xi}$ that contains it. If $|\tilde S_\tau^{\varepsilon + 2\xi}| \le k$, for some parameter $k$, then we can use an algorithm that computes the $k$ longest paths (see, e.g., [7]) to find this set.
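A tiny numeric example may help fix the meaning of the two inclusions of Lemma 4.2. The sketch below assumes three hypothetical paths whose estimated average lengths are each within `xi` of the true ones (the conclusion of Lemma 4.1), and checks that the true longest paths are contained in the estimated (2 xi)-longest set, and vice versa. All numbers are made up for illustration.

```python
def eps_longest(avg_length, eps):
    """Paths whose average length is within eps of the maximum."""
    best = max(avg_length.values())
    return {p for p, length in avg_length.items() if length >= best - eps}

# Hypothetical average path lengths over tau rounds.
true_len = {"p1": 100.0, "p2": 97.0, "p3": 80.0}
est_len = {"p1": 98.5, "p2": 98.0, "p3": 81.0}
xi = 1.5

# Precondition corresponding to Lemma 4.1: every estimate is within xi.
within_xi = all(abs(est_len[p] - true_len[p]) <= xi for p in true_len)

true_longest = eps_longest(true_len, 0.0)          # S^0
est_relaxed = eps_longest(est_len, 2 * xi)         # estimated (2 xi)-longest set
est_longest = eps_longest(est_len, 0.0)            # estimated longest set
true_relaxed = eps_longest(true_len, 2 * xi)       # S^{2 xi}
```

Here the estimated relaxed set contains one extra candidate besides the true longest path; such candidates can be separated afterwards by measuring them directly.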

Results  under  Unique  Longest  Path  Assumption 

While  Lemma  4.2  is  very  general,  we  now  give  one  interesting  implication  for  finding  a  longest  path 
under  the  following  assumption. 


Assumption 4.1 There exists a single path $x^*$ that is the longest path on any round with a certain (known) margin $\rho$:

$$\forall x \in \mathcal{P},\ x \ne x^*,\ \forall t, \quad (x^* - x)^T w_t > \rho$$

Note that if there is a unique longest path (for any margin $\rho > 0$), then we can see that

$$x^* = \arg\max_{x \in \mathcal{P}} \frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T x = \arg\max_{x \in \mathcal{P}} \max_{t = 1, \ldots, \tau} w_t^T x$$

Thus,  under  the  above  margin  assumption,  we  can,  in  fact,  recover  the  longest  path,  as  shown  in  the  next 
Theorem. 


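Theorem 4.2 Suppose Assumption 4.1 holds with $\rho > 0$. We run Algorithm 2 for $\tau = (8bM)^2 \rho^{-2} (2b + 2\ln(2\delta^{-1}))$ iterations. Then with probability at least $1 - \delta$, Algorithm 2 outputs

$$\tilde x_\tau^* := \arg\max_{x \in \mathcal{P}} x^T \sum_{t=1}^{\tau} \tilde w_t$$

and $\tilde x_\tau^*$ is equal to $x^*$.

Proof:

Let $\tilde x_\tau^* = \arg\max_{x \in \mathcal{P}} x^T \sum_{t=1}^{\tau} \tilde w_t$. We claim that, with probability $1 - \delta$, it is equal to $x^*$. Indeed, suppose $\tilde x_\tau^* \ne x^*$. By Lemma 4.2, $\tilde x_\tau^* \in \tilde S_\tau^{0} \subseteq S_\tau^{2\xi}$ with probability at least $1 - \delta$. Thus,

$$\frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T \tilde x_\tau^* \ge \frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T x^* - 2\xi.$$

Assumption 4.1, however, implies that

$$\frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T x^* > \frac{1}{\tau} \sum_{t=1}^{\tau} w_t^T \tilde x_\tau^* + \rho,$$

leading to a contradiction whenever $\rho > 2\xi = \tau^{-1/2}\, 8bM \sqrt{2b + 2\ln(2\delta^{-1})}$. Rearranging the terms, we arrive at $\tau > (8bM)^2 \rho^{-2} (2b + 2\ln(2\delta^{-1}))$, as assumed. We conclude that with probability at least $1 - \delta$, $\tilde x_\tau^* = x^*$ and $\{x^*\} = S_\tau^{0} = \tilde S_\tau^{0}$.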

□ 
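To get a feel for the iteration count in Theorem 4.2, the bound can be evaluated numerically. The helper below is a direct transcription of the formula, rounded up to an integer; the function name and the sample parameter values are hypothetical and are not taken from the experiments.

```python
import math

def iterations_simplified(b, M, rho, delta):
    """Evaluate (8bM)^2 * rho^-2 * (2b + 2 ln(2/delta)), rounded up.

    b: number of basis paths, M: bound on basis-path length,
    rho: margin of the longest path, delta: allowed failure probability.
    """
    if rho <= 0 or not 0 < delta < 1:
        raise ValueError("need rho > 0 and 0 < delta < 1")
    tau = (8 * b * M) ** 2 / rho ** 2 * (2 * b + 2 * math.log(2 / delta))
    return math.ceil(tau)

# Hypothetical values: 3 basis paths, lengths up to 100, margin 50, 95% confidence.
tau_example = iterations_simplified(b=3, M=100, rho=50, delta=0.05)
# Doubling the margin cuts the required number of rounds by roughly a factor of 4.
tau_wide_margin = iterations_simplified(b=3, M=100, rho=100, delta=0.05)
```

The count grows quadratically in $bM/\rho$ but only logarithmically in $1/\delta$, so tightening the confidence level is cheap compared with resolving a small margin.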

The  following  weaker  assumption  also  has  interesting  implications. 

Assumption 4.2 There exists a path $x^* \in \mathcal{P}$ such that it is the longest path on any round:

$$\forall x \in \mathcal{P},\ \forall t, \quad (x^* - x)^T w_t \ge 0$$

If, after running Algorithm 2 for enough iterations, we find all $2\xi$-longest paths (the set $\tilde S_\tau^{2\xi}$), Lemma 4.2 guarantees that, under Assumption 4.2, the longest path $x^* \in \tilde S_\tau^{2\xi}$ is one of them with high probability. As discussed earlier, we can use an efficient $k$-longest paths computation to find a set containing $\tilde S_\tau^{2\xi}$. We can then use this information to repeatedly test the candidate paths in this set to find the worst-case path and estimate its length.

4.3  Analysis  under  General  Weights-Perturbation  Model 

We now present an analysis of GameTime under the general weight-perturbation model given in Sec. 3. For easy reference, we give the GameTime algorithm again below, but with the new environment model, as Algorithm 3.

As  before,  let  M  be  any  upper  bound  on  the  length  of  any  basis  path  (where  the  length  includes  the 
perturbation). 

In the general model, the environment $\mathcal{E}$ picks a distribution with mean $\mu_t^x \in \mathbb{R}^m$, which depends on the algorithm's chosen path $x_t$. From this distribution, $\mathcal{E}$ draws a vector of perturbations $\pi_t^x \in \mathbb{R}^m$. The vector $\pi_t^x$ satisfies the following assumptions:

•  Bounded perturbation:

$\|\pi_t^x\|_1 \le N$, where $N$ is a parameter.

•  Bounded mean perturbation of path length:

For any path $x \in \mathcal{P}$, $|x^T \mu_t^x| \le \mu_{\max}$.

Note that $\mu_t^x$ is a function of the chosen path, and that $\pi_t^x$ depends on $\mu_t^x$.

We now state the main lemma for our general model. In this case, we calculate $\tilde w_t$ as an estimate of the sum $w_t + \pi_t^x$.

Algorithm 3 GameTime with general environment model

1: Input $\tau \in \mathbb{N}$
2: Compute a 2-barycentric spanner $\{b_1, \ldots, b_b\}$
3: for $t = 1$ to $\tau$ do
4:    Environment chooses $w_t$.
5:    We choose $i_t \in \{1, \ldots, b\}$ uniformly at random.
6:    Environment chooses a distribution from which to draw $\pi_t^x$, where the mean $\mu_t^x$ and support of the distribution satisfy the assumptions given above.
7:    We predict the path $x_t = b_{i_t}$ and observe the path length $l_t = b_{i_t}^T (w_t + \pi_t^x)$
8:    Estimate $\tilde v_t \in \mathbb{R}^b$ as $\tilde v_t = b\, l_t \cdot e_{i_t}$, where $\{e_i\}$ denotes the standard basis.
9:    Compute estimated weights $\tilde w_t = B^+ \tilde v_t$
10: end for
11: Use the obtained sequence $\tilde w_1, \ldots, \tilde w_\tau$ to find a longest path(s). For example, for Theorem 4.4, we compute $\tilde x_\tau^* := \arg\max_{x \in \mathcal{P}} x^T \sum_{t=1}^{\tau} \tilde w_t$.


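Lemma 4.3 With probability at least $1 - \delta$, for all $x \in \mathcal{P}$,

$$\left| \frac{1}{\tau} \sum_{t=1}^{\tau} (\tilde w_t - w_t - \pi_t^x)^T x \right| \le (2b + 1)\mu_{\max} + \tau^{-1/2} \left( c \sqrt{2b + 2\ln(4\delta^{-1})} + d \sqrt{2m + 2\ln(4\delta^{-1})} \right) \qquad (9)$$

where $c = 2b(2M + \mu_{\max})$ and $d = N + \mu_{\max}$.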

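Proof: The proof is similar to that of Lemma 4.1, so we only highlight the differences here.

$$\begin{aligned}
E_t \tilde v_t &= E_{i_t} \left\{ E_{\pi_t^x} \left[ b\,(b_{i_t}^T w_t + b_{i_t}^T \pi_t^x(x_t)) \cdot e_{i_t} \mid i_t \right] \right\} \\
&= \frac{1}{b} \sum_{i=1}^{b} b\,(b_i^T w_t) \cdot e_i + \frac{1}{b} \sum_{i=1}^{b} b\,(b_i^T \mu_t^{b_i})\, e_i \\
&= B w_t + \mu_t^{basis}
\end{aligned}$$

where $\mu_t^{basis}$ denotes the $b \times 1$ vector of means in which the $i$th element is $b_i^T \mu_t^{b_i}$ and each entry is bounded in absolute value by $\mu_{\max}$.

Fix any $\alpha \in \{-2, 2\}^b$. As before, the sequence $Z_1, \ldots, Z_\tau$, where $Z_t = \alpha^T (\tilde v_t - v_t - \mu_t^{basis})$, is a bounded martingale difference sequence. A bound on the range of the random variables can be computed by observing

$$|\alpha^T \tilde v_t| = \left| \alpha^T \left[ b\,(b_{i_t}^T (w_t + \pi_t^x))\, e_{i_t} \right] \right| \le 2b\, |b_{i_t}^T (w_t + \pi_t^x)| \le 2bM,$$

$$|\alpha^T v_t| \le 2bM$$

and

$$|\alpha^T \mu_t^{basis}| \le 2b\mu_{\max},$$

implying

$$|Z_t| \le 2b(2M + \mu_{\max}) = c.$$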


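Thus, using the Azuma-Hoeffding inequality, we can conclude that for any $\delta_1 > 0$, and for fixed $\alpha$,

$$\Pr\left( \left| \sum_{t=1}^{\tau} Z_t \right| > c \sqrt{2\tau \ln\left(2\,(2^b)\,\delta_1^{-1}\right)} \right) \le \delta_1 / 2^b$$

and (skipping a few intermediate steps involving the union bound as before), we finally get

$$\Pr\left( \forall x \in \mathcal{P},\ \left| \sum_{t=1}^{\tau} (\tilde w_t - w_t)^T x \right| \le 2b\tau\mu_{\max} + c \sqrt{2\tau b + 2\tau \ln(2\delta_1^{-1})} \right) \ge 1 - \delta_1. \qquad (10)$$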


Now consider any fixed $x \in \{0,1\}^m$. We claim that the sequence $Y_1, \ldots, Y_\tau$, where $Y_t = x^T \pi_t^x(x) - x^T \mu_t^x$, is also a bounded martingale difference sequence. Clearly, since $E_t[\pi_t^x(x)] = \mu_t^x$, $E_t[Y_t] = 0$. Further, a bound on the range of the random variables can be computed by observing

$$|x^T \pi_t^x(x)| \le N \quad \text{and} \quad |x^T \mu_t^x| \le \mu_{\max},$$

implying

$$|Y_t| \le N + \mu_{\max} = d.$$

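An application of the Azuma-Hoeffding inequality for the fixed $x$ and for any $\delta_2 > 0$ yields

$$\Pr\left( \left| \sum_{t=1}^{\tau} Y_t \right| > d \sqrt{2\tau \ln\left(2\,(2^m)\,\delta_2^{-1}\right)} \right) \le \delta_2 / 2^m.$$

Taking the union bound over all $x \in \{0,1\}^m$,

$$\Pr\left( \forall x \in \{0,1\}^m,\ \left| \sum_{t=1}^{\tau} x^T (\pi_t^x(x) - \mu_t^x) \right| \le d \sqrt{2\tau m + 2\tau \ln(2\delta_2^{-1})} \right) \ge 1 - \delta_2.$$

Thus, we get

$$\Pr\left( \forall x \in \{0,1\}^m,\ \left| \sum_{t=1}^{\tau} x^T \pi_t^x(x) \right| \le \left| \sum_{t=1}^{\tau} x^T \mu_t^x \right| + d \sqrt{2\tau m + 2\tau \ln(2\delta_2^{-1})} \right) \ge 1 - \delta_2$$

and finally

$$\Pr\left( \forall x \in \{0,1\}^m,\ \left| \sum_{t=1}^{\tau} x^T \pi_t^x(x) \right| \le \tau\mu_{\max} + d \sqrt{2\tau m + 2\tau \ln(2\delta_2^{-1})} \right) \ge 1 - \delta_2. \qquad (11)$$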


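Setting $\delta_1 = \delta_2 = \delta/2$ in Relations 10 and 11 above and dividing them throughout by $\tau$, we get that each of the following two inequalities holds for some $x \in \mathcal{P}$ with probability at most $\delta/2$:

$$\left| \frac{1}{\tau} \sum_{t=1}^{\tau} x^T (\tilde w_t - w_t) \right| > 2b\mu_{\max} + \tau^{-1/2}\, c \sqrt{2b + 2\ln(4\delta^{-1})}$$

$$\left| \frac{1}{\tau} \sum_{t=1}^{\tau} x^T \pi_t^x(x) \right| > \mu_{\max} + \tau^{-1/2}\, d \sqrt{2m + 2\ln(4\delta^{-1})}$$

From the above relations, we can conclude that, for all $x \in \mathcal{P}$, the following inequality holds with probability at least $1 - \delta$:

$$\left| \frac{1}{\tau} \sum_{t=1}^{\tau} x^T (\tilde w_t - w_t - \pi_t^x(x)) \right| \le (2b + 1)\mu_{\max} + \tau^{-1/2} \left( c \sqrt{2b + 2\ln(4\delta^{-1})} + d \sqrt{2m + 2\ln(4\delta^{-1})} \right) \qquad (12)$$

which yields the desired lemma.

□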

Estimating  Longest  Paths 

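From Lemma 4.3, we can derive results on estimating the $\varepsilon$-longest paths and the longest path in a manner similar to that employed in Section 4.2. The main difference is that now we view $\sum_{t=1}^{\tau} \tilde w_t$ as an estimate of $\sum_{t=1}^{\tau} (w_t + \pi_t^x)$ rather than of $\sum_{t=1}^{\tau} w_t$ alone.

Thus, we now define the set $S_\tau^{\varepsilon}$ as

Definition 4.3 Define the set of $\varepsilon$-longest paths with respect to the actual delays

$$S_\tau^{\varepsilon} = \left\{ x \in \mathcal{P} : \frac{1}{\tau} \sum_{t=1}^{\tau} (w_t + \pi_t^x)^T x \ge \max_{x' \in \mathcal{P}} \frac{1}{\tau} \sum_{t=1}^{\tau} (w_t + \pi_t^{x'})^T x' - \varepsilon \right\}$$

The definition of the set $\tilde S_\tau^{\varepsilon}$ stays unchanged.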

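The lemma on approximating the sets $S$ by $\tilde S$ now becomes the following:

Lemma 4.4 For any $\varepsilon \ge 0$ and for

$$\xi = (2b + 1)\mu_{\max} + \tau^{-1/2} \left( c \sqrt{2b + 2\ln(4\delta^{-1})} + d \sqrt{2m + 2\ln(4\delta^{-1})} \right)$$

we have

$$\tilde S_\tau^{\varepsilon} \subseteq S_\tau^{\varepsilon + 2\xi} \quad \text{and} \quad S_\tau^{\varepsilon} \subseteq \tilde S_\tau^{\varepsilon + 2\xi}$$

with probability at least $1 - \delta$.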

Under the margin assumption (Assumption 4.1), we can recover the longest path in the general weight-perturbation model, using the same reasoning as before.

Theorem 4.4 Suppose Assumption 4.1 holds with $\rho > (4b + 2)\mu_{\max}$. We run Algorithm 3 for $\tau = 8\,(\rho - (4b + 2)\mu_{\max})^{-2} \left( c^2 (b + \ln(4\delta^{-1})) + d^2 (m + \ln(4\delta^{-1})) \right)$ iterations.

Then with probability at least $1 - \delta$, Algorithm 3 outputs

$$\tilde x_\tau^* := \arg\max_{x \in \mathcal{P}} x^T \sum_{t=1}^{\tau} \tilde w_t$$

and $\tilde x_\tau^*$ is equal to $x^*$.

The proofs of Lemma 4.4 and Theorem 4.4 are virtually identical to the corresponding results in Section 4.2, so we omit them here. Also, as in that section, we can identify the longest path under the weaker Assumption 4.2 by finding a set containing $\tilde S_\tau^{2\xi}$ and enumerating the paths in it.

5 Experimental Results


We  have  implemented  and  evaluated  our  approach  for  problems  in  execution  time  analysis.  Our  analysis 
tool,  called  GameTime,  can  generate  an  estimate  of  the  execution  time  profile  of  the  program  as  well  as  a 
worst-case  execution  time  estimate.  This  section  details  our  implementation  and  results. 

5.1  Implementation 

GameTime  operates  in  four  stages,  as  described  below. 

1.  Extract CFG. GameTime begins by extracting the control-flow graph (CFG) of the real-time task whose WCET must be estimated. This part of GameTime is built on top of the CIL front end for C [8]. Our CFG parameters (numbers of nodes, edges, etc.) are thus specific to the CFG representations constructed by CIL. In general, nodes correspond to the start of basic blocks of the program and edges indicate flow of control, with edges labeled by a conditional or basic block. In our experience, this phase is usually fast, taking no more than a minute for any of our benchmarks.

2.  Compute  basis  paths.  The  next  step  for  GameTime  is  to  compute  the  set  of  basis  paths  and  the  B+ 
matrix.  This  is  done  essentially  as  discussed  in  Section  4,  where  we  also  ensure  the  feasibility  of  basis  paths 
by  the  use  of  integer  programming  and  SMT  solving.  This  phase  can  be  somewhat  time-consuming;  in  our 
experiments,  the  basis  computation  for  the  largest  benchmark  (statemate)  took  about  15  minutes. 

3.  Generate program inputs. Given the set of basis paths for the graph, GameTime then has to generate, for each basis path, inputs to the program that will drive the program's execution down that path. It does this using constraint-based test generation, by generating a constraint satisfaction problem characterizing each basis path, and then using a constraint solver based on Boolean satisfiability (SAT). This phase uses the UCLID decision procedure [3] to generate inputs for each path and creates one copy of the program for each path, with the different copies only differing in their initialization functions. For our experiments, this constraint-based test generation phase was very quick, taking less than a minute for each benchmark.

4. Predict estimated weight vector or longest path. Finally, Algorithm 2 is run with the set of basis paths and their corresponding programs, along with the B matrix. The number of iterations in the algorithm, $\tau$, depends on the mode of usage of the tool. In the experiments reported below, we used a deterministic cycle-accurate processor simulator, and hence $\tau$ was set equal to b, since we perform one simulation per basis path. In general, $\tau$ can be pre-computed as described in Section 4 or increased gradually while searching for convergence to a single longest path.
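As an illustration of the basis-path step (step 2 above), the following is a minimal sketch that assumes each path is given as a 0/1 edge-incidence vector. It greedily keeps every path that is linearly independent of the ones already kept, using exact rational elimination. It is not the construction of Section 4 and performs no feasibility checks; the names `reduce_against`, `select_basis_paths`, and `PATHS` are hypothetical.

```python
from fractions import Fraction

def reduce_against(vec, pivots):
    """Eliminate from vec its components along the stored pivot rows."""
    vec = [Fraction(x) for x in vec]
    for col, row in pivots:
        if vec[col] != 0:
            factor = vec[col] / row[col]
            vec = [a - factor * b for a, b in zip(vec, row)]
    return vec

def select_basis_paths(paths):
    """Return indices of a maximal linearly independent subset of paths."""
    pivots = []   # (pivot column, reduced row) for every path kept so far
    chosen = []
    for idx, path in enumerate(paths):
        reduced = reduce_against(path, pivots)
        col = next((j for j, x in enumerate(reduced) if x != 0), None)
        if col is not None:   # path is independent of the paths kept so far
            pivots.append((col, reduced))
            chosen.append(idx)
    return chosen

# Two if-else diamonds in sequence: 8 edges, 4 source-sink paths.
PATHS = [
    [1, 0, 1, 0, 1, 0, 1, 0],  # left branch, then left branch
    [1, 0, 1, 0, 0, 1, 0, 1],  # left, then right
    [0, 1, 0, 1, 1, 0, 1, 0],  # right, then left
    [0, 1, 0, 1, 0, 1, 0, 1],  # right, then right
]
BASIS = select_basis_paths(PATHS)
```

In this toy CFG the fourth path equals the second plus the third minus the first, so three of the four paths suffice as a basis.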

The  run-time  for  this  phase  depends  on  the  execution  time  of  the  program  and  the  number  of  iterations  of 
the  loop  in  Algorithm  2;  for  our  experiments,  this  run-time  was  under  a  minute  for  all  benchmarks. 

Given the estimated weights computed at each round, $\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_\tau$, we can compute the overall estimated weight vector $\frac{1}{\tau} \sum_{t=1}^{\tau} \tilde{w}_t$ and use this to predict the length of any path in the program. In particular, we can predict the longest path, and its corresponding length. Alternatively, the predicted longest path can be executed (or simulated) several times to calculate the desired timing estimate.
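The prediction step can be sketched as follows, under simplifying assumptions: per-round edge-weight estimates are already available as plain lists, and CFG nodes are numbered in topological order with node 0 as the source and the last node as the sink. The function names and the sample numbers are hypothetical, not GameTime's implementation.

```python
def average_weights(rounds):
    """Average the per-edge weight estimates over all rounds."""
    num_rounds = len(rounds)
    return [sum(column) / num_rounds for column in zip(*rounds)]

def longest_path(num_nodes, edges, weights):
    """Longest path from node 0 to node num_nodes - 1 in a DAG.

    Assumes u < v for every edge (u, v), i.e. nodes are topologically
    numbered, and that the sink is reachable from the source.
    Returns (length, list of edge indices along the path).
    """
    best = [float("-inf")] * num_nodes
    pred = [None] * num_nodes          # edge index used to reach each node
    best[0] = 0.0
    for i in sorted(range(len(edges)), key=lambda i: edges[i][0]):
        u, v = edges[i]
        if best[u] + weights[i] > best[v]:
            best[v] = best[u] + weights[i]
            pred[v] = i
    path, node = [], num_nodes - 1
    while node != 0:                   # walk back from sink to source
        path.append(pred[node])
        node = edges[pred[node]][0]
    return best[-1], path[::-1]

# A single if-else diamond with two rounds of (made-up) weight estimates.
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3)]
AVG = average_weights([[10, 4, 5, 8], [12, 6, 5, 10]])
LENGTH, PATH = longest_path(4, EDGES, AVG)
```

Here the averaged weights are 11, 5, 5, and 9, so the predicted longest path takes edges 0 and 2 with length 16.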

5.2  Benchmarks 

Our  benchmarks  were  drawn  from  those  used  in  the  WCET  Challenge  2006  [23],  which  were  drawn  from 
the  Malardalen  benchmark  suite  [15]  and  the  PapaBench  suite  [18].  In  particular,  we  used  benchmarks  that 
came  from  real  embedded  software  (as  opposed  to  toy  programs),  had  non-trivial  control  flow,  and  did  not 
require  automatic  estimation  of  loop  bounds.  The  latter  criterion  ruled  out,  for  example,  benchmarks  that 
compute  a  discrete  cosine  transform  or  perform  data  compression,  because  there  is  usually  just  one  path 
through those programs (going through several iterations of a loop), and variability in run-time usually only comes from characteristics of the data. Most benchmarks in the Malardalen suite are of this nature.

The main characteristics of the chosen benchmarks are shown in Table 1. The first three benchmarks, altitude, stabilisation, and climb_control, are tasks in the open source PapaBench software for an unmanned aerial vehicle (UAV) [18]. The last benchmark, statemate, is code generated from a Statemate Statecharts model for an automotive window control system. Note, in particular, how the number of basis paths b is far less than the total number of source-sink paths in the CFG. (We are able to efficiently count the number of paths as the CFG is a DAG.) We also indicate the number of lines of code for each task; however, note that this is an imprecise metric as it includes declarations, comment lines, and blank lines - the CFG size is a more accurate representation of size.
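The remark that paths in a DAG can be counted efficiently can be illustrated with a short dynamic program. This is a sketch under the assumption that nodes are numbered in topological order; `count_paths` and `diamond_chain` are hypothetical helper names. It also evaluates m - n + 2, which bounds the dimension of the space spanned by source-sink path vectors in a connected DAG with one source and one sink, and which coincides with the basis-path counts listed in Table 1.

```python
def count_paths(num_nodes, edges):
    """Count paths from node 0 to the last node of a DAG.

    Assumes u < v for every edge (u, v), so processing edges in sorted
    order finalizes counts[u] before it is propagated to v.
    """
    counts = [0] * num_nodes
    counts[0] = 1
    for u, v in sorted(edges):
        counts[v] += counts[u]
    return counts[-1]

def diamond_chain(k):
    """CFG consisting of k if-else diamonds in sequence."""
    edges = []
    for i in range(k):
        top = 3 * i
        edges += [(top, top + 1), (top, top + 2),
                  (top + 1, top + 3), (top + 2, top + 3)]
    return 3 * k + 1, edges

N, EDGES_20 = diamond_chain(20)
TOTAL_PATHS = count_paths(N, EDGES_20)   # grows as 2**k
BASIS_BOUND = len(EDGES_20) - N + 2      # grows as k + 1
```

For 20 sequential branches the total path count is 2^20 while the bound on the number of basis paths is only 21, the same kind of gap that Table 1 shows for statemate.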


| Name | LOC | CFG nodes n | CFG edges m | Total num. of paths | Num. of basis paths b |
|---|---|---|---|---|---|
| altitude | 12 | 12 | 16 | 11 | 6 |
| stabilisation | 48 | 31 | 39 | 216 | 10 |
| climb_control | 43 | 40 | 56 | 657 | 18 |
| statemate | 916 | 290 | 471 | $7 \times 10^{16}$ | 183 |

Table 1: Characteristics of Benchmarks. “LOC” indicates the number of lines of C code for the task. The Control-Flow Graph (CFG) is constructed using the CIL front end; n is the number of nodes and m is the number of edges.


5.3  Worst-Case  Execution  Time  Analysis 

We  have  compared  GameTime  with  leading  tools  for  WCET  analysis.  We  present  here  a  comparison 
with  Chronos  [13].  These  tools  are  based  on  models  crafted  for  particular  architectures,  and  are  designed 
to  generate  conservative  (over-approximate)  WCET  bounds.  Although  GameTime  is  not  guaranteed  to 
generate  an  upper  bound  on  the  WCET,  we  have  found  that  GameTime  can  produce  larger  WCET  estimates 
than  these  tools.  We  also  show  that  GameTime  does  significantly  better  than  simply  testing  the  programs 
with  inputs  generated  uniformly  at  random. 

5.3.1  Comparison  with  Chronos  and  Random  Testing 

We performed experiments to compare GameTime against Chronos [13] as well as against testing the programs on randomly generated inputs. WCET estimates are output in terms of the number of CPU cycles taken by the task to complete in the worst case.

Chronos  is  built  upon  SimpleScalar  [25],  a  widely-used  tool  for  processor  simulation  and  performance 
analysis.  Chronos  extracts  a  CFG  from  the  binary  of  the  program  (compiled  for  MIPS  using  modified 
SimpleScalar tools), and uses a combination of dataflow analysis, integer programming, and manually constructed processor behavior models to estimate the WCET of the task.

To  compare  GameTime  against  Chronos,  we  used  SimpleScalar  to  simulate,  for  each  task,  each  of  the 
extracted  basis  paths.  We  used  the  same  SimpleScalar  processor  configuration  as  we  did  for  Chronos  (which 
is  Chronos’  default  configuration),  specified  below: 

-cache:il1 il1:16:32:2:l -mem:lat 30 2 -bpred 2lev -bpred:2lev 1 128 2 1 -decode:width 1 -issue:width 1 -commit:width 1 -fetch:ifqsize 4 -ruu:size 8

Since SimpleScalar’s execution is deterministic for a fixed processor configuration, we did not run Algorithm 2 in its entirety. Instead, we simulated each of the basis paths exactly once (factoring out the time for initialization code) and then predicted the longest path as described in Section 4. The predicted longest path was then simulated once and its execution time is reported as GameTime’s WCET estimate.

The random testing was done by generating initial values for each program input variable uniformly at random from its domain. For each benchmark, we generated 500 such random initializations; note that GameTime performs significantly fewer simulations (only as many as there are basis paths, for a maximum of 183 for the statemate benchmark).
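The random-testing baseline can be sketched as below. The task `toy_task_cycles` is a made-up stand-in for a benchmark whose slow path is triggered by a single input value, and both function names are hypothetical; only the procedure (draw inputs uniformly at random, report the maximum observed cost) follows the description above.

```python
import random

def toy_task_cycles(x):
    """Hypothetical task: one rare input value triggers the slow path."""
    return 100 if x == 12345 else 10

def random_testing_estimate(task, domain_size, runs=500, seed=0):
    """Maximum observed cost over `runs` inputs drawn uniformly at random."""
    rng = random.Random(seed)
    return max(task(rng.randrange(domain_size)) for _ in range(runs))

ESTIMATE = random_testing_estimate(toy_task_cycles, 2 ** 16)
```

With a 16-bit input domain, 500 uniform draws hit the one slow input with probability below 1%, which is why a path-directed method can report a higher worst-case estimate with fewer runs.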


| Benchmark | Chronos WCET $T_c$ | Random testing $T_r$ | GameTime estimate $T_g$ | $(T_c - T_g)/T_g$ (%) | Basis path time, max | Basis path time, min |
|---|---|---|---|---|---|---|
| altitude | 567 | 175 | 348 | 62.9 | 343 | 167 |
| stabilisation | 1379 | 1435 | 1513 | -8.9 | 1513 | 1271 |
| climb_control | 1254 | 646 | 952 | 31.7 | 945 | 167 |
| statemate | 8584 | 4249 | 4575 | 87.6 | 3735 | 3235 |

Table 2: Comparison with Chronos and random testing. Execution time estimates are in number of cycles reported by SimpleScalar. For random testing, the maximum cycle count over 500 runs is reported. The fifth column indicates the percentage over-estimation by Chronos over GameTime, and the last two columns indicate the maximum and minimum cycle counts for basis paths generated by GameTime.

Our results are reported in Table 2. We note that GameTime’s estimate $T_g$ is lower than the WCET $T_c$ reported by Chronos for three out of the four benchmarks. Interestingly, $T_g > T_c$ for the stabilisation benchmark; on closer inspection, we found that this occurred mainly because the number of misses in the instruction cache was significantly underestimated by Chronos. The over-estimation by Chronos for statemate is very large, much larger than for altitude and climb_control. This appears to arise from the fact that the number of branch mis-predictions estimated by Chronos is significantly larger than that actually occurring: 106 by Chronos versus 19 mis-predictions on the longest path simulated by GameTime in SimpleScalar. In fact, the number of branches performed in a single loop of the statemate code is bounded by approximately 40.
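The over-estimation column of Table 2 can be recomputed directly from the Chronos and GameTime columns. The snippet below is a small check using the cycle counts from the table; the names `overestimation_pct` and `TABLE2` are only illustrative.

```python
def overestimation_pct(t_c, t_g):
    """Percentage by which the Chronos bound t_c exceeds the GameTime estimate t_g."""
    return 100.0 * (t_c - t_g) / t_g

# (Chronos WCET, GameTime estimate) in cycles, copied from Table 2.
TABLE2 = {
    "altitude": (567, 348),
    "stabilisation": (1379, 1513),
    "climb_control": (1254, 952),
    "statemate": (8584, 4575),
}

PERCENTAGES = {
    name: round(overestimation_pct(t_c, t_g), 1)
    for name, (t_c, t_g) in TABLE2.items()
}
```

A negative value, as for stabilisation, means that the measured GameTime estimate exceeds the Chronos bound.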

We also note that GameTime’s estimates can be significantly higher than those generated by random testing. Moreover, GameTime’s predicted WCET is at least as high as the execution time of every basis path (and strictly higher for three of the four benchmarks), indicating that the basis paths taken together provide more information about the longest path than is available from them individually.

5.4  Estimating  the  Full  Timing  Profile  of  a  Program 

One  of  the  unique  aspects  of  GameTime  is  the  ability  to  predict  the  execution  time  profile  of  a  program  - 
the  distribution  of  execution  times  over  program  paths  -  as  formalized  in  Lemma  4.3. 

To experimentally validate this ability, we performed experiments with a complex processor architecture - the StrongARM-1100 - which implements the ARM instruction set with a complex pipeline and both data and instruction caches. The SimIt-ARM cycle-accurate simulator [19] was used in these experiments.

In our experiments, we first executed each basis path generated by GameTime on the SimIt-ARM simulator and generated the averaged estimated weight vector $w_{avg} = \frac{1}{\tau} \sum_{t=1}^{\tau} \tilde{w}_t$, the average over all $\tau$ rounds. Using this estimated weight vector as the weights on edges in the CFG, we then efficiently computed the estimated length of each path x in the CFG as $x \cdot w_{avg}$ using dynamic programming. We also exhaustively enumerated all program paths for the small programs in our benchmark set, and simulated each of these paths to compute its execution time.
For the altitude program, the histogram of execution times generated by GameTime perfectly matched the true histogram generated by exhaustively enumerating program paths.

For  the  climb_control  task,  the  GameTime  histogram  is  a  close  match  to  the  true  histogram,  as  can 
be  seen  in  Figure  5.  Out  of  a  total  of  657  paths,  129  were  found  to  be  feasible;  of  these,  GameTime’s 
prediction  differs  from  the  true  execution  time  on  only  12  paths,  but  the  prediction  is  never  off  by  more  than 
20  cycles. 
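A histogram of predicted path lengths can be obtained without listing paths one by one, by propagating length distributions through the DAG. The sketch below assumes integer edge weights (so that lengths can serve as dictionary keys) and topologically numbered nodes; `path_length_histogram` and the sample weights are hypothetical.

```python
from collections import Counter

def path_length_histogram(num_nodes, edges, weights):
    """Map each source-to-sink path length to the number of paths with that length.

    Assumes u < v for every edge (u, v) and integer weights.
    """
    lengths = [Counter() for _ in range(num_nodes)]
    lengths[0][0] = 1                      # one empty path at the source
    for i in sorted(range(len(edges)), key=lambda i: edges[i][0]):
        u, v = edges[i]
        for length, count in lengths[u].items():
            lengths[v][length + weights[i]] += count
    return lengths[-1]

# Two if-else diamonds in sequence with made-up edge weights (cycles).
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 6), (5, 6)]
WEIGHTS = [3, 5, 0, 0, 10, 20, 0, 0]
HISTOGRAM = path_length_histogram(7, EDGES, WEIGHTS)
```

Running the same function once with predicted weights and once with per-path measurements gives the two histograms that a plot such as Figure 5 compares. Note that this version counts all structural paths, including infeasible ones, which would have to be filtered separately.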


[Figure 5: histogram not reproduced; axis label: Execution time (cycle count)]


Figure  5:  Estimating  the  distribution  of  execution  times  with  GameTime.  The  true  execution  times  are 
indicated  by  white  bars,  the  predicted  execution  times  by  gray  bars,  and  the  cases  where  the  two  coincide  are  colored  black. 

In summary, we have found GameTime to be an adequate technique to estimate not just the WCET, but also the distribution of execution times of a program, even for complex microprocessor platforms. A key aspect of GameTime’s effectiveness has been the generation of tests for basis paths. We have also experimented with other coverage metrics such as statement coverage, but these do not yield the same level of accuracy as does basis path coverage. Full path coverage is very difficult to achieve for programs that exhibit path explosion (e.g., statemate), while the number of basis paths remains tractable.

6  Conclusions 

We have presented a new, game-theoretic approach to estimating quantitative properties of a software task, such as its execution time profile and worst-case execution time (WCET). Our tool, GameTime, is measurement-based, making it easy to use on many different platforms without the need for tedious processor behavior analysis. We have presented both theoretical and experimental results for the utility of the GameTime approach for quantitative analysis, in particular for timing estimation.

We  note  that  our  algorithm  and  results  of  Section  4  are  general,  in  that  they  apply  to  estimating  longest 
paths  in  DAGs  in  an  unpredictable  environment,  not  just  to  timing  estimation  for  embedded  software.  One 
could  apply  the  algorithms  presented  in  this  paper  to  quantitative  analysis  of  many  systems  with  suitable 
graph  models.  Several  potential  applications  are  worth  exploring,  including  timing  analysis  of  combinational 
circuits  and  distributed  embedded  and  control  systems,  as  well  as  power  estimation  of  embedded  systems. 

A  Azuma-Hoeffding  Inequality


The  Azuma-Hoeffding  inequality  is  a  very  useful  concentration  inequality.  A  version  of  this  inequality  with 
a  slightly  better  constant  is  given  as  Lemma  A.7  in  [4]. 


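Lemma A.1 Let $Y_1, \ldots, Y_\tau$ be a martingale difference sequence. Suppose that $|Y_t| \le c$ almost surely for all $t \in \{1, \ldots, \tau\}$. Then for any $\delta > 0$,

$$\Pr\left( \left| \sum_{t=1}^{\tau} Y_t \right| > \sqrt{2 \tau c^2 \log(2/\delta)} \right) \le \delta.$$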


One-sided inequalities for $\sum_{t=1}^{\tau} Y_t$ also hold by replacing $2/\delta$ with $1/\delta$ in the logarithm. The inequality is an instance of the so-called concentration of measure inequalities. Roughly speaking, it says that if each random variable fluctuates within the bounds $[-c, c]$, then the sum of these variables fluctuates, with high probability, within $[-c\sqrt{\tau}, c\sqrt{\tau}]$.
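To give a feel for the magnitudes involved, the following sketch evaluates the two-sided bound $\sqrt{2 \tau c^2 \log(2/\delta)}$, where $\tau$ is the number of terms, c the almost-sure bound on each term, and $\delta$ the failure probability. It also solves for the smallest $\tau$ at which the bound on the average (the bound divided by $\tau$) drops below a target eps. The function names and the sample parameters are hypothetical.

```python
import math

def azuma_bound(tau, c, delta):
    """Deviation bound sqrt(2 * tau * c^2 * log(2 / delta)) on the sum."""
    return math.sqrt(2.0 * tau * c * c * math.log(2.0 / delta))

def rounds_needed(c, delta, eps):
    """Smallest tau with azuma_bound(tau, c, delta) / tau <= eps.

    Follows from c * sqrt(2 * log(2/delta) / tau) <= eps, i.e.
    tau >= 2 * c^2 * log(2/delta) / eps^2.
    """
    return math.ceil(2.0 * c * c * math.log(2.0 / delta) / (eps * eps))

BOUND = azuma_bound(100, 1.0, 0.05)     # about 27.16
TAU = rounds_needed(1.0, 0.05, 0.1)     # 738
```

The required number of terms grows only logarithmically in $1/\delta$, which matches the polynomial dependence on $\ln(1/\delta)$ claimed in the abstract.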